Strategies for sustainable MT for Basque: incremental design, reusability, standardization and open-source

Authors

  • Iñaki Alegria
  • Xabier Arregi
  • Xabier Artola
  • Arantza Díaz de Ilarraza
  • Gorka Labaka
  • Mikel Lersundi
  • Aingeru Mayor
  • Kepa Sarasola
Abstract

We present several Language Technology applications that have proven effective in promoting the use of Basque, a less-privileged European language. We also present the strategy we have followed for almost twenty years to develop those applications as the top layer of an integrated environment of language resources, language foundations, language tools and other applications. This strategy has worked well when we faced a difficult task such as Machine Translation into Basque: we obtained good results in a short time by reusing previous work on Basque, reusing other open-source tools, and developing only a few new modules in collaboration with other groups. In addition, new reusable tools and formats have been produced.

Similar resources

Wikipedia and Machine Translation: killing two birds with one stone

In this paper we present the free/open-source language resources for machine translation created in the OpenMT-2 wikiproject, a collaboration framework that was tested with editors of the Basque Wikipedia. Post-editing of Computer Science articles has been used to improve the output of a Spanish-to-Basque MT system called Matxin. For the collaboration between editors and researchers, we selected a set ...

Developing an Open-Source FST Grammar for Verb Chain Transfer in a Spanish-Basque MT System

This paper presents the current status of development of a finite-state transducer grammar for the verbal-chain transfer module in Matxin, a Rule-Based Machine Translation system between Spanish and Basque. Due to the distance between Spanish and Basque, the verbal-chain transfer is a very complex module in the overall system. The grammar is compiled with foma, an open-source finite-state toolki...

Strategies and Tools to Enable Reuse in Serious Games Ecosystems and Beyond (DOI: 10.12753/2066-026x-14-000)

Software ecosystems are defined as collections of organizations that are related through software or a software related concept. Within such ecosystems, reusability is fundamental to software sustainability and cost-efficiency. Design for reusability brings the technical promise of high quality software that emerges from clean design, fitness for a purpose and a low defect count. This ensures t...

An Open Architecture for Transfer-based Machine Translation between Spanish and Basque

We present the current status of development of an open architecture for the translation from Spanish into Basque. The machine translation architecture uses an open-source analyser for Spanish and new modules mainly based on finite-state transducers. The project is integrated in the OpenTrad initiative, a larger government-funded project shared among different universities and small companies, w...

An empirical investigation on the reusability of design patterns and software packages

Nowadays open-source software communities are thriving. Successful open-source projects are competitive, and the amount of source code that is freely available offers great reuse opportunities to software developers. Thus, it is expected that several requirements can be implemented based on open-source software reuse. Additionally, design patterns, i.e. well-known solutions to common design probl...

Publication date: 2008